ABSTRACT

We compute population synthesis models for the variation of Hα absorption indices
(HαA and HαF) as a function of age and metallicity in old stellar systems. The models
are based on the STELIB spectral library of J.-F. Le Borgne et al., and are defined at a
resolution of 3 Å FWHM. Errors in the age and metallicity responses are derived by
bootstrap resampling the input measurements on the stellar library. The indices are
found to be highly sensitive to age variation, with only moderate response to metallicity.
For galaxies uncontaminated by nebular emission, our HαA index is more powerful
in breaking the age-metallicity degeneracy than Hβ or HγF. Using a sample of red
cluster galaxies from Nelan et al., carefully selected to exclude objects with emission,
we find a steep decline of HαA with velocity dispersion σ (slope −0.75 ± 0.07 Å dex⁻¹).
The slope can be translated to constraints on age and metallicity scaling relations,
incorporating measurement errors and also the model errors determined from the
bootstrap method. If the HαA-σ slope is due only to age, we obtain
$\mathrm{Age} \propto \sigma^{0.95 \pm 0.12}$. Because HαA depends quite weakly on
[Fe/H], a metallicity interpretation would require $\mathrm{Fe/H} \propto \sigma^{1.2}$
or steeper. The HαA-σ slope is consistent with the combined age
and metallicity scaling relations reported by Nelan et al. from classical Lick indices.
The relations obtained by Thomas et al. significantly under-predict the observed slope.
The discrepancy could arise from differences in the sample selection. In particular, our
sample probes a lower mass range, is not explicitly selected on morphological criteria
and excludes objects significantly bluer than the red sequence. We discuss in detail the
effects of emission contamination on the results, and conclude that these are unlikely
to yield the observed behaviour in the Hα-σ relations. Indeed, similar results are
obtained using HαF, despite its different sensitivity to Hα and [N II] emission lines.
The steep age-mass relation supports a "downsizing" formation scenario: fainter
red-sequence galaxies became quiescent at lower redshifts, z < 0.5. This picture accords
with recent observations of truncated red sequences in clusters at z ~ 0.7.
1 INTRODUCTION

In the integrated spectra of old stellar systems, the strength 
of the Balmer absorption lines reflects the luminosity- 
weighted effective temperature, which is dominated by the 
Main Sequence turn-off. In a simple (i.e. single-age, single- 
metallicity) system, the turn-off luminosity and temperature 
are sensitive primarily to the age of the population. On this 
basis, measurements of Hβ and Hγ line strengths have been
widely used to constrain the ages of early-type galaxies both 
in clusters and in the field (see Thomas et al. 2004b, and 
many references therein). Despite extensive work, some fun- 
damental questions remain unresolved, such as the influence 
of age variations in driving the "red sequence" of cluster el- 
lipticals (e.g. Caldwell 2003; Thomas et al. 2004b; Nelan et 
al. 2005). 

The interpretation of the Balmer lines suffers from at 
least three complications. First, horizontal branch morphol- 
ogy is not included in most stellar population models; if 
blue Horizontal Branch stars are present, they enhance the 
Balmer lines and mimic the effect of younger ages (Maras- 
ton & Thomas 2000; Thomas et al. 2004b). Second, the in- 
dices are measured on low effective-resolution spectra (usu- 
ally limited by the internal velocity dispersion of the galaxy 
itself). This causes the Balmer absorption and neighbour- 
ing continua to be blended with a forest of nearby metal 

Table 1. Hα index definitions from Nelan et al. (2005). Bandpass limits are in Å.

Index   Blue continuum   Central band   Red continuum
HαA     6515-6540        6554-6575      6575-6585
HαF     6515-6540        6554-6568      6568-6575

absorption lines. Thus rather than true equivalent widths, 
one measures "indices" (such as those of the Lick system, 
see Burstein et al. 1984), which have residual sensitivity to 
metallicity and abundance ratios. Third, many early-type 
galaxies exhibit weak nebular emission which "fills in" the 
stellar absorption, leading to over-estimates of the popula- 
tion ages (e.g. Gonzalez 1993). The contamination is reduced 
for the higher-order lines, Hγ and Hδ, as a result of the typical
line ratios for nebular emission. Unfortunately, however,
the usual index definitions for these lines are highly contaminated
by metal-line blanketing, and depend more strongly
on the overall metallicity (Worthey & Ottaviani 1997; but
see Vazdekis & Arimoto 1999). Moreover, they are sensitive
to the enhancement of α-elements, as shown by Thomas,
Maraston & Korn (2004a). By contrast, the low-order line
Hβ is a less ambiguous tracer of age, if the nebular emission
problem can be overcome.

In this paper, we explore whether these trends extend
to the lowest-order Balmer line, Hα. In Section 2 we review
two index definitions, HαA and HαF, introduced by
Nelan et al. (2005) (Section 2.1), describe the population
synthesis model used (Section 2.2), and determine the sensitivity
of the indices to age and metallicity (Section 2.3).
Section 3 confronts the synthesis results with measurements
for red-sequence galaxies in the NOAO Fundamental Plane
Survey (NFPS; Smith et al. 2004; Nelan et al. 2005). The observed
Hα-log σ relations for a low-emission subsample are derived
in Section 3.1, while Section 3.2 discusses the implied
age and metallicity variations. In Section 3.3 we incorporate
errors in the population synthesis model, using a bootstrap
method. Section 3.4 describes the problem of nebular
emission contamination; other potential complications are
discussed in Section 3.5. On balance, the steep slope of the
observed Hα-log σ relations is best reproduced by a strong
gradient in age, with low-mass galaxies on average younger
than larger objects. We discuss the plausibility of this picture
in Section 4, comparing to observations at intermediate
redshifts. The main conclusions are summarized in Section 5.



2 Hα POPULATION SYNTHESIS
2.1 Index definitions 

Throughout this paper, we work with the two index definitions
HαA and HαF introduced by Nelan et al. The bandpasses
defining these indices are reproduced in Table 1 and
shown in Figure 1, in comparison to some representative
spectra for stars and galaxies. The two indices were defined
with the practical intention of being measurable for as many
objects as possible in the NFPS dataset. Within NFPS, only
the spectra obtained at the WIYN telescope have sufficient
coverage in the red to measure Hα, and even then, only for
the most nearby galaxies. The major practical consideration
was thus to ensure the red continuum was as close as possible
to the line. A second constraint is the presence of two
[N II] emission lines bracketing Hα, at 6548 Å and 6583 Å.
Although it is possible to place the blue pseudo-continuum
on the blue side of the 6548 Å line, pushing the red
pseudo-continuum beyond the 6583 Å line is impractical given the
spectral-range constraint. Two alternative definitions were
proposed, with different treatments of the red continuum.

Figure 1. Hα index definitions from Nelan et al. (2005). The
upper shaded boxes indicate the feature and pseudo-continuum
pass-bands for HαF; the lower boxes show the pass-bands for
HαA. Comparison spectra show stars from the high-resolution
library of Montes, Ramsey & Welty (1999), with spectral types
F8V (top), G8III, K7V. Two galaxies are shown from NFPS, one
with emission and the other apparently free from contamination.

The narrower index, HαF, has continuum bands placed
to avoid the [N II] lines in galaxies with an emission component.
Where emission is present, HαF will be weakened
by infilling, or will become negative if the emission dominates.
For pure-absorption objects, the sensitivity of HαF is
likely to be reduced because the absorption spreads into the
continuum band, especially at large velocity dispersion. The
wider definition, HαA, has continuum bands better separated
from Hα itself, and so provides a better indicator of stellar
absorption. Where emission is present, both the central
band and the red continuum band are contaminated. The
net effect will depend on the [N II]/Hα ratio; given the
LINER-like ratios typically observed in ellipticals (Phillips
et al. 1986), the Hα emission is likely to be partially compensated
by [N II] in most cases.
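
As an illustration of how such bandpass definitions translate into a measurement, the following is a minimal sketch assuming the conventional Lick-style recipe: a straight-line pseudo-continuum drawn through the mean fluxes of the two flanking bands, and an equivalent width integrated over the central band. The function names, the trapezoidal integration and the toy Gaussian spectrum are assumptions made for this example; this is not the code used for the published measurements.

```python
import numpy as np

# Bandpasses in Angstrom, from Table 1: (blue continuum, central band, red continuum).
BANDS = {
    "HaA": ((6515.0, 6540.0), (6554.0, 6575.0), (6575.0, 6585.0)),
    "HaF": ((6515.0, 6540.0), (6554.0, 6568.0), (6568.0, 6575.0)),
}

def integrate(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def band_mean(wave, flux, lo, hi, n=201):
    """Mean flux between lo and hi, resampled onto a fine grid."""
    grid = np.linspace(lo, hi, n)
    return integrate(np.interp(grid, wave, flux), grid) / (hi - lo)

def measure_index(wave, flux, name):
    """Equivalent-width style index in Angstrom (positive for absorption)."""
    blue, centre, red = BANDS[name]
    x_blue, x_red = 0.5 * sum(blue), 0.5 * sum(red)   # band mid-points
    f_blue = band_mean(wave, flux, *blue)
    f_red = band_mean(wave, flux, *red)
    grid = np.linspace(centre[0], centre[1], 401)
    # Straight-line pseudo-continuum through the two flanking band means.
    continuum = f_blue + (f_red - f_blue) * (grid - x_blue) / (x_red - x_blue)
    line = np.interp(grid, wave, flux)
    return integrate(1.0 - line / continuum, grid)

# Toy spectrum (hypothetical): flat continuum plus a Gaussian Halpha absorption
# line with a true equivalent width of 2 A and a Gaussian width of 3 A.
wave = np.arange(6500.0, 6600.0, 0.25)
ew_true, width = 2.0, 3.0
depth = ew_true / (width * np.sqrt(2.0 * np.pi))
flux = 1.0 - depth * np.exp(-0.5 * ((wave - 6562.8) / width) ** 2)

ha_a = measure_index(wave, flux, "HaA")
ha_f = measure_index(wave, flux, "HaF")
print(f"HaA = {ha_a:.3f} A, HaF = {ha_f:.3f} A")
```

For a broad line of this kind the narrower index recovers less of the absorption than the wider one, because the wing of the line falls in its red continuum band. This is the loss of sensitivity at large velocity dispersion described above.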

For age determination, a key issue is to separate the hydrogen
line itself from the surrounding metallic lines. In the
Hα region, the density of metal lines is much lower than for
the blue Balmer lines. One potential contaminant arises from
Ca I at 6573 Å, which falls in the central pass-band of HαA
and in the red continuum of HαF. For emission-free galaxies,
the treatment of this feature is likely the most important
difference between the two indices. The abundance of Ca in
elliptical galaxies seems to track Fe, rather than following
Mg as would be expected for an α-element (e.g. Thomas,
Maraston & Bender 2003b; Cenarro et al. 2004, and references
therein). Thus although Ca I 6573 Å may introduce
[Fe/H] contamination, this should not depend strongly on
[α/Fe].

Figure 2. Temperature dependence of the HαA and HαF indices,
as determined from the STELIB library stars. Dwarf stars
(log g > 3.4) are plotted with filled symbols and solid line; giant
stars with open symbols and dashed line. The index values
have been corrected to [Fe/H] = 0 and log g = 3.4, to isolate the
temperature terms. Each fit is a cubic in log T_eff.

Figure 3. Metallicity dependence of the HαA and HαF indices,
as determined from the STELIB library stars. Symbols and
line-styles as in Figure 2. The index values have been corrected to
log T_eff = 3.7 and log g = 3.4, to isolate the metallicity term. The
fits are linear in [Fe/H].

In conclusion, our HαA and HαF indices are practically
motivated, and were not tuned for optimal performance in
population synthesis models. However, as we show in the
following sections, the NFPS Hα indices appear to be excellent
age indicators, with very little residual sensitivity to
the metallicity.

2.2 Population synthesis 

To predict the HαA and HαF index values for simple stellar
populations, we first determine their responses to physical
properties of stars, then combine the contributions using
stellar properties along a given evolutionary isochrone,
weighted by an assumed initial mass function.

For this analysis, we use the STELIB spectral library of
Le Borgne et al. (2003), which covers a metallicity range of
−2.0 < [Fe/H] < +0.5 at a nominal spectral resolution of 3 Å.
This library is well-matched to the NFPS native resolution
(also ~3 Å). The HαA and HαF indices were measured on the
STELIB library spectra using the program indexf (Cenarro
et al. 2001), as used also for the NFPS galaxy measurements.

We derive separate fitting functions for the dwarf and
giant sequences, separated at log g = 3.4 (Figures 2, 3 and 4).
Since the indices are especially sensitive to temperature, we
model them with a cubic in log T_eff, and linear correction
terms in log g and [Fe/H] (alternative parametrizations of
the T_eff dependence have been tested, without significant effect
on our conclusions). The stars allowed to contribute to
the fits have temperatures below 7000 K (for 47 giant stars)
or 9000 K (for 61 dwarfs) and metallicity [Fe/H] > −0.5. The
stellar parameter range covered by the library and the fits
limits the validity of our models to ages > 1 Gyr and metallicities
−0.5 < [Fe/H] < +0.3. At metallicities above twice-solar,
the library coverage is very sparse, compromising predictions
for the most enriched systems such as truly giant ellipticals.
Normal early-type galaxies, with σ < 200 km s⁻¹, exhibit only
modestly super-solar values ([Fe/H] < +0.3) when measured
from Fe-dominated features, rather than α-enhanced lines.
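
The functional form of these fitting functions can be sketched as a linear least-squares problem. The example below is illustrative only: the centring of the terms on log T_eff = 3.7 and log g = 3.4, the function names, and the synthetic "dwarf" sample with its coefficients are all assumptions made for the sketch, not the values derived from STELIB.

```python
import numpy as np

def design_matrix(log_teff, log_g, feh):
    """Columns: 1, x, x^2, x^3 (x = log Teff - 3.7), log g - 3.4, [Fe/H]."""
    x = np.asarray(log_teff, dtype=float) - 3.7   # assumed reference temperature
    return np.column_stack([
        np.ones_like(x), x, x**2, x**3,
        np.asarray(log_g, dtype=float) - 3.4,     # assumed reference gravity
        np.asarray(feh, dtype=float),
    ])

def fit_index(log_teff, log_g, feh, index):
    """Least-squares coefficients of the cubic-plus-linear fitting function."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(log_teff, log_g, feh), index, rcond=None)
    return coeffs

def predict_index(coeffs, log_teff, log_g, feh):
    """Evaluate the fitting function for given stellar parameters."""
    return design_matrix(log_teff, log_g, feh) @ coeffs

# Synthetic sample of 61 "dwarfs" (hypothetical parameters and coefficients).
rng = np.random.default_rng(42)
n_star = 61
log_teff = rng.uniform(3.60, 3.95, n_star)
log_g = rng.uniform(3.4, 5.0, n_star)
feh = rng.uniform(-0.5, 0.3, n_star)
true_coeffs = np.array([2.0, 8.0, 10.0, -40.0, -0.1, -0.2])
index = design_matrix(log_teff, log_g, feh) @ true_coeffs + rng.normal(0.0, 0.05, n_star)

coeffs = fit_index(log_teff, log_g, feh, index)
resid = index - predict_index(coeffs, log_teff, log_g, feh)
rms = float(np.sqrt(np.mean(resid**2)))
print("fitted coefficients:", np.round(coeffs, 3), "rms residual:", round(rms, 3))
```

In practice one such fit would be made separately for the dwarf and giant samples, with the temperature and metallicity cuts listed above applied beforehand.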

Figure 4. Gravity dependence of the HαA and HαF indices, as
determined from the STELIB library stars. Symbols and
line-styles as in Figure 2. The index values have been corrected to
log T_eff = 3.7 and [Fe/H] = 0, to isolate the gravity term. The fits
are linear in log g. Note that because gravity and temperature are
strongly correlated for giants, the real variations in Hα at fixed
temperature and metallicity are much smaller than the total range
shown here.

To determine the HαA and HαF that would be measured
on an integrated population of a single age and single
metallicity, we sum the contributions from different stellar
masses along the appropriate isochrone. For these calculations,
we adopt the Padova theoretical tracks, taken from
Salasnich et al. (2000), with solar-scaled abundance ratios
and metallicities Z = 0.008, 0.019, 0.040, 0.070. The contribution
of each mass interval is weighted by the number of
stars in the interval, assuming a Salpeter initial mass function
(IMF), $N(M)\,\mathrm{d}M \propto M^{-2.35}\,\mathrm{d}M$, and by the R-band
luminosity of the star, as a proxy for the Hα continuum flux.
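
The weighting scheme described above can be sketched as follows. The sketch treats each index as an equivalent width, so that stellar contributions combine as a continuum-flux-weighted mean; that treatment, the function names and the four-bin toy isochrone are assumptions for illustration, not the Padova tracks or the actual synthesis code.

```python
import numpy as np

def salpeter_number(m_lo, m_hi, slope=2.35):
    """Relative number of stars with m_lo < M < m_hi for N(M) dM ∝ M^-slope dM."""
    p = 1.0 - slope
    return (np.asarray(m_hi, dtype=float)**p - np.asarray(m_lo, dtype=float)**p) / p

def integrated_index(mass_edges, lum_r, star_index):
    """Index of the integrated population: mean of the stellar indices,
    weighted by (number of stars in the mass bin) x (R-band luminosity)."""
    mass_edges = np.asarray(mass_edges, dtype=float)
    n_stars = salpeter_number(mass_edges[:-1], mass_edges[1:])
    weights = n_stars * np.asarray(lum_r, dtype=float)
    return float(np.sum(weights * np.asarray(star_index, dtype=float)) / np.sum(weights))

# Toy isochrone (hypothetical numbers): three Main Sequence bins and one
# narrow mass bin standing in for the luminous giant branch.
mass_edges = [0.60, 0.80, 0.95, 1.00, 1.02]   # solar masses
lum_r = [0.2, 0.8, 2.5, 60.0]                 # R-band luminosity per star
star_index = [1.0, 2.0, 3.0, 1.0]             # index per star, in Angstrom

ssp_index = integrated_index(mass_edges, lum_r, star_index)
print(f"integrated index = {ssp_index:.3f} A")
```

The per-star index values would come from the fitting functions, evaluated at the temperature, gravity and metallicity of each point on the isochrone.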



2.3 Results 

Figure 5 shows the synthetic HαA and HαF as a function
of input age and metallicity. As usual with such models,
relative changes in the indices are expected to be more robust
than their absolute values. In particular, we can extract
the following "responses" to the population parameters:

\[
\frac{\partial \mathrm{H}\alpha_A}{\partial \log \mathrm{Age}} = -0.783^{+0.100}_{\ldots}, \qquad
\frac{\partial \mathrm{H}\alpha_A}{\partial [\mathrm{Fe/H}]} = -0.161^{+0.150}_{\ldots},
\]
\[
\frac{\partial \mathrm{H}\alpha_F}{\partial \log \mathrm{Age}} = -0.683^{\ldots}_{\ldots}, \qquad
\frac{\partial \mathrm{H}\alpha_F}{\partial [\mathrm{Fe/H}]} = -0.270^{+0.086}_{\ldots}.
\]


In practice these responses are derived from a simultaneous
linear fit to HαA,F(log Age, [Fe/H]), over the subset
of models with Age > 3 Gyr. This fit reproduces the index
predictions with an rms scatter of less than 0.025 Å. The
errors in the responses are estimated through a bootstrap
algorithm, described below.
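
The simultaneous linear fit can be illustrated with the tabulated HαA predictions of Table 2, restricted to log(age) ≥ 9.5 (Age > 3 Gyr). This is a sketch under the assumption that an unweighted least-squares plane is an adequate stand-in for the fit described above; the tabulated grid need not be identical to the model set used for the quoted responses, so only approximate agreement should be expected.

```python
import numpy as np

log_age = np.linspace(9.5, 10.3, 9)           # Table 2 columns with Age > 3 Gyr
feh = np.array([-0.38, 0.00, 0.32, 0.56])     # Table 2 rows
haa = np.array([                              # HaA predictions (Angstrom)
    [2.209, 2.165, 2.080, 1.995, 1.927, 1.798, 1.717, 1.687, 1.672],
    [2.196, 2.095, 2.006, 1.942, 1.863, 1.774, 1.694, 1.614, 1.547],
    [2.161, 2.042, 1.965, 1.898, 1.812, 1.723, 1.648, 1.575, 1.524],
    [2.063, 1.971, 1.939, 1.849, 1.781, 1.695, 1.639, 1.586, 1.359],
])

def fit_responses(log_age, feh, grid):
    """Least-squares plane: index = c0 + a * log(Age) + b * [Fe/H].
    Returns (a, b), the age and metallicity responses."""
    zz, aa = np.meshgrid(feh, log_age, indexing="ij")   # same shape as grid
    design = np.column_stack([np.ones(grid.size), aa.ravel(), zz.ravel()])
    coeffs, *_ = np.linalg.lstsq(design, grid.ravel(), rcond=None)
    return float(coeffs[1]), float(coeffs[2])

resp_age, resp_feh = fit_responses(log_age, feh, haa)
print(f"dHaA/dlogAge = {resp_age:.3f}, dHaA/d[Fe/H] = {resp_feh:.3f}")
print(f"ratio of responses = {resp_age / resp_feh:.1f}")
```

The same function applied to the HαF rows of Table 2 gives the corresponding responses for the narrower index.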

Physically, the age dependence of Hα absorption arises
primarily from the increased temperature, hence stronger
Balmer absorption, at the Main Sequence turn-off in younger
populations. Our results are contrary to the claim by Caldwell
et al. (2003) that Hα should be insensitive to age because
the R-band luminosity is dominated by the red giant
branch, in which Hα absorption is weak and independent
of age. The underlying Padova/Salpeter population model
instead yields roughly equal flux contributions from the giants
and the dwarfs. The ratio of Hα contributions for the
youngest models is 25:75 (giant:dwarf), while the oldest
models have around 50:50 ratios.

Figure 5. Index predictions for HαA (solid lines) and HαF
(dashed lines). The upper panel shows the indices as a function of
age, for constant metallicity Z = 0.008 (highest), 0.019, 0.040 and
0.070. The lower panel shows index variations with metallicity, at
constant age of 1.00 (highest), 1.78, 3.16, 5.62, 10.0, 17.8 Gyr.

Figure 6. Parameter responses for the two Hα indices. The large
circles indicate the best-fit values derived from the original
synthesis; small crosses show responses generated by bootstrap
resampling the input stellar library. Grey regions show the 68.3%
(i.e. 1σ) intervals for each parameter separately.

The error estimates are derived from 1000 bootstrap
realisations of the input stellar library, generated by resampling
(with replacement) from the initial input list. The aim
is to propagate the uncertainties in the fitting functions,
in particular those due to sparse coverage of the stellar atmosphere
parameter space by STELIB. Implicitly, the method
assumes that coverage of the parameter space is incomplete,
but is not systematically biased. For each realisation, new
fitting functions are computed (according to the same criteria
used above), and used to predict index values for the
grid of stellar populations. The resulting distribution for the
parameter responses is shown in Figure 6. Although most
realisations yield results close to the default model, some ~3%
lead to dramatically different parameter responses, with Hα
absorption increasing with age. These cases arise when the
two cool dwarf stars are both absent from the resampled
library. In such cases, the cubic log T_eff term is badly
under-constrained, yielding unphysically large Hα contributions
from the lower Main Sequence. This does not imply a problem
in the default synthesis, since in this case the library
does adequately constrain the cubic fit. The "failure" of
some bootstrap trials illustrates an important aspect of the
bootstrap results: each realisation samples the parameter
space less well than the original library. Our error estimates
are therefore conservative in this respect. In Section 3.3
the bootstrap results will be further propagated to
constrain the age and metallicity scaling relations from the
observed Hα-σ relations.
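
The resampling step can be sketched generically. In the example below a straight-line fit to toy data stands in for the full chain of fitting functions and population synthesis; the function names, the toy data and the use of the 15.85th and 84.15th percentiles for the 68.3% interval are assumptions made for this illustration.

```python
import numpy as np

def bootstrap_fit(x, y, fit, n_boot=1000, seed=0):
    """Refit on n_boot resamples (with replacement) of the input list and
    return the array of fitted parameters, one entry per realisation."""
    rng = np.random.default_rng(seed)
    n = len(x)
    results = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # draw n entries with replacement
        results.append(fit(x[idx], y[idx]))
    return np.array(results)

def slope_fit(x, y):
    """Stand-in for the full pipeline: slope of a straight-line fit."""
    return np.polyfit(x, y, 1)[0]

# Toy input list (hypothetical): 100 points scattered about a line of slope -0.8.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 100)
y = -0.8 * x + rng.normal(0.0, 0.1, 100)

best = slope_fit(x, y)
samples = bootstrap_fit(x, y, slope_fit, n_boot=1000)
lo, hi = np.percentile(samples, [15.85, 84.15])   # central 68.3% interval
print(f"best = {best:.3f}, 68.3% interval = [{lo:.3f}, {hi:.3f}]")
```

Because each resample contains repeated entries, it covers the input list less completely than the original, which is why rare stars that anchor a fit can drop out of some realisations.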

Our models are constructed on the assumption of solar-scaled
chemical abundances, while in fact elliptical galaxies
typically have [α/Fe] = 0.1-0.3 (e.g. Kuntschner et al. 2001).
Thomas et al. (2003a, 2004a) have generalised linestrength
prediction models to allow for non-solar abundance ratios.
These models involve corrections based on theoretical stellar
atmosphere calculations (e.g. Tripicco & Bell 1995) for the
contribution of various metallic absorption lines. They find
that Hδ and Hγ are highly sensitive to [α/Fe] variations,
since the hydrogen lines are embedded in a forest of metal
lines which contribute to the measured index. By contrast,
the Hβ line lies in a spectral region with a lower density
of metal lines, and is almost insensitive to [α/Fe]. As discussed
in Section 2.1, high-resolution stellar spectra reveal
very clean continuum regions around Hα (Figure 1). The
Ca I 6573 Å absorption line likely does not introduce strong
[α/Fe] dependence, given the anomalous behaviour of Ca
relative to other α-elements. Thus, although we cannot readily
determine the [α/Fe] response of the Hα indices defined
here, it seems unlikely that [α/Fe] variations seriously compromise
relative age measurements using HαA.

Taking the ratio of the parameter responses, we find
that HαA is f ≈ 4.8 times more sensitive to a decade change
in age than to a decade change in metallicity, while for
HαF the equivalent ratio is f ≈ 1.9. For comparison, in the
Thomas et al. (2003a, 2004a) models, the standard Lick Hβ
index has f ≈ 1.7, while the HγA,F and HδA,F indices all
have f ≈ 1.0. The bootstrap simulations reveal substantial
uncertainty in the metallicity response, and hence in f, so it is
premature to conclude that HαA offers a dramatic improvement
over the more widely used features. However, the two
Hα indices, and especially HαA, appear promising in this
respect, and worthy of future study to constrain further the
metallicity dependence.



3 A STRONG AGE-MASS RELATION ALONG 
THE RED SEQUENCE? 

3.1 Comparison to NFPS data 

In this section, we compare the predicted Hα absorption to
observed values reported by Nelan et al. (2005) from the
NFPS. We use the "full-resolution" (~3 Å FWHM) measurements,
corrected for aperture and velocity-broadening
effects. The NFPS galaxy sample was selected by apparent
magnitude (R_tot < 17) and colour (Δ(B − R) > −0.2),
relative to the red-sequence "ridge" in each cluster (Smith
et al. 2004). There is no cut on the red side of the ridge.
Note especially that we do not explicitly select by galaxy
morphology; thus some of the galaxies included in this sample
would not be present in a hand-picked set of bona fide
ellipticals and S0s. Moreover, the NFPS sample explicitly
selects against any massive ellipticals with relatively blue
colours. It is for this reason that we refer always to
red-sequence galaxies, rather than early-type galaxies. Although
the full NFPS database includes linestrength data for ~4000
galaxies, Hα is beyond the range of most of the spectra.
Only 689 confirmed cluster members, spanning a redshift
range cz = 3600-12700 km s⁻¹, have measurements for
HαA and HαF, plus the necessary auxiliary data (velocity
dispersions and emission-line estimates). The redshift range of
the Hα dataset lies below the median depth of NFPS
(15000 km s⁻¹). Because the NFPS galaxy sample is flux
limited, and Hα is observed for only the nearby objects,
the typical galaxy in the Hα sample is slightly fainter, and
hence less massive, than that of the whole survey.

As discussed in the Introduction, a major concern with
Hα is that the age-sensitive stellar absorption may be contaminated
by nebular emission, both at Hα itself, and in the
neighbouring [N II] lines. It is critical therefore to work with
a subsample of galaxies in which nebular contamination is
minimized. Nelan et al. (2005) determined emission equivalent
widths for the [O III] line at 5007 Å, and for Hβ, by
fitting and dividing an appropriate stellar continuum. For
the present analysis, we exclude 250 galaxies with emission

Table 2. Predictions for the HαA and HαF indices as a function of age and metallicity. Each row is a constant-metallicity track;
each column is a track of constant log(age), with age in yr, as indicated by the column headings.

Index  [Fe/H]  9.0    9.1    9.2    9.3    9.4    9.5    9.6    9.7    9.8    9.9    10.0   10.1   10.2   10.3
HαA    -0.38   3.305  2.945  2.705  2.559  2.346  2.209  2.165  2.080  1.995  1.927  1.798  1.717  1.687  1.672
       0.00    3.207  2.824  2.659  2.523  2.368  2.196  2.095  2.006  1.942  1.863  1.774  1.694  1.614  1.547
       +0.32   3.015  2.736  2.562  2.439  2.305  2.161  2.042  1.965  1.898  1.812  1.723  1.648  1.575  1.524
       +0.56   2.838  2.615  2.448  2.321  2.176  2.063  1.971  1.939  1.849  1.781  1.695  1.639  1.586  1.359
HαF    -0.38   2.030  1.876  1.735  1.666  1.521  1.427  1.400  1.336  1.268  1.215  1.088  1.019  1.011  1.009
       0.00    2.092  1.853  1.732  1.638  1.522  1.384  1.309  1.234  1.181  1.113  1.035  0.963  0.889  0.826
       +0.32   1.992  1.794  1.648  1.556  1.447  1.325  1.225  1.160  1.101  1.023  0.940  0.872  0.801  0.750
       +0.56   1.852  1.682  1.541  1.434  1.303  1.201  1.122  1.096  1.013  0.949  0.865  0.812  0.760  0.582

at Hβ and/or at [O III], based on a 1σ detection limit. This
rejection scheme is much stricter than the fixed thresholds
imposed by Nelan et al. To ensure sufficient data quality,
we impose a further selection for HαA errors less than 0.3 Å.
After applying these cuts, the sample comprises 410 galaxies.

The Hα-σ relations for this low-emission subsample
are shown in Figure 7. The data are fit using an iterative 3σ
clipping, with the effect of rejecting a further four outliers
in HαA and eleven in HαF. For the median σ ~ 125 km s⁻¹,
the fitted values are HαA ≈ 1.69 Å and HαF ≈ 1.06 Å. Taking
both data and models at face value, these indicate average
ages of 10-13 Gyr (HαA) or 7-9 Gyr (HαF), if 0.0 < [Fe/H] <
0.3. Potential causes of this discrepancy are discussed in
Sections 3.4 and 3.5.
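
An iterative 3σ-clipped straight-line fit of the kind used above can be sketched as follows. The details here are assumptions: the fit is unweighted, the scatter is taken as the standard deviation of the retained residuals, and the data are synthetic (410 points with an imposed slope of −0.75 and four artificial outliers), not the NFPS measurements.

```python
import numpy as np

def sigma_clipped_linefit(x, y, nsigma=3.0, max_iter=20):
    """Straight-line fit with iterative rejection of points lying more than
    nsigma times the rms scatter from the current fit."""
    keep = np.ones(len(x), dtype=bool)
    for _ in range(max_iter):
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        resid = y - (slope * x + intercept)
        scatter = np.std(resid[keep])
        new_keep = np.abs(resid) <= nsigma * scatter
        if np.array_equal(new_keep, keep):   # converged: no change in rejections
            break
        keep = new_keep
    return slope, intercept, keep

# Synthetic index-sigma relation (hypothetical data).
rng = np.random.default_rng(7)
n_gal = 410
log_sigma = rng.uniform(1.8, 2.5, n_gal)
index = 1.69 - 0.75 * (log_sigma - np.log10(125.0)) + rng.normal(0.0, 0.15, n_gal)
index[:4] -= 2.0                             # four artificial emission-like outliers

slope, intercept, keep = sigma_clipped_linefit(log_sigma, index)
print(f"slope = {slope:.2f} A per dex, rejected = {np.sum(~keep)}")
```

A weighted fit using the individual index errors would follow the same loop, with the weights passed to the line fit and the clipping threshold expressed in units of each point's error.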

Differential observations, e.g. the change in Hα absorption
with galaxy mass, are expected on general grounds to
be more robust than absolute values. The observed HαA-σ
relation has a slope of −0.75 ± 0.07 Å per decade in σ. For
HαF the equivalent slope is −0.70 ± 0.07. In the remainder
of this section, we attempt to interpret these slopes first in
terms of age and metallicity variations, and then consider
alternative explanations.



3.2 Age or metallicity variations 

Given the measured gradient of HαA-σ (or HαF-σ) and a
pair of values summarizing the index response to age and
[Fe/H], we can constrain the exponents (α, β) of scaling
relations such that $\mathrm{Age} \propto \sigma^{\alpha}$ and
$\mathrm{Fe/H} \propto \sigma^{\beta}$. Naturally, since
only one gradient is measured, the viable values for α and β
fall along a degenerate linear track, along which the metallicity
and age effects combine to produce the observed Hα
slope. By setting each of (α, β) to zero in turn, we can determine
the age and metallicity trends required if only one
of these parameters varies with σ.

If we naively adopt simply the "best" parameter responses
from the population synthesis of Section 2.3 and allow
only for the errors in the observed HαA-σ relation, we
would require either $\mathrm{Age} \propto \sigma^{0.95 \pm 0.09}$ (if metallicity is
constant) or $\mathrm{Fe/H} \propto \sigma^{4.6 \pm 0.5}$ (if age is constant). From HαF
the equivalent slopes are (α, β) = (1.02, 2.5) with comparable
errors. The metallicity trend required is far in excess
of determinations using Lick indices (e.g. Kuntschner et al.
2001, who similarly constrained their fits by assuming no age
variation, reporting $\mathrm{Fe/H} \propto \sigma^{\sim 1.0}$). The age interpretation
implies a factor of five in age over the 60-300 km s⁻¹ range
in velocity dispersion, which is also surprisingly strong (but
see Section 4).

A similarly strong age-mass relation has been claimed
by Nelan et al. (2005), based on traditional Lick indices from
NFPS. Their reported scaling relations are $\mathrm{Age} \propto \sigma^{0.67 \pm 0.15}$
and $\mathrm{Fe/H} \propto \sigma^{0.47 \pm 0.05}$. Allowing both parameters to vary
according to the Nelan et al. relations, we would expect an
HαA slope of −0.52 Å from age and an additional −0.08 Å
from [Fe/H] variations, totalling −0.60 ± 0.12 Å per dex,
marginally consistent with the observed −0.75 ± 0.07. The
Nelan et al. analysis is independent of the present work, in
the sense that it does not employ Hα absorption, and uses a
much larger sample of ~3500 galaxies. However, the sample
characteristics are similar, since both works are based on the
same survey.
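
The arithmetic linking responses, scaling exponents and predicted slopes can be checked with a short sketch. It assumes the linear combination implied by the degenerate track, i.e. predicted slope = (age response) × α + (metallicity response) × β, and uses the HαA best-fit responses from Section 2.3; the function names are introduced here for illustration only.

```python
def single_parameter_exponents(slope, resp_age, resp_feh):
    """Exponents required if only age (alpha) or only Fe/H (beta) varies with sigma."""
    return slope / resp_age, slope / resp_feh

def predicted_slope(alpha, beta, resp_age, resp_feh):
    """Index slope (A per dex in sigma) implied by Age ∝ sigma^alpha, Fe/H ∝ sigma^beta."""
    return resp_age * alpha + resp_feh * beta

RESP_AGE, RESP_FEH = -0.783, -0.161   # HaA responses (A per dex)
SLOPE_OBS = -0.75                     # observed HaA-sigma slope (A per dex)

alpha_only, beta_only = single_parameter_exponents(SLOPE_OBS, RESP_AGE, RESP_FEH)
age_factor = (300.0 / 60.0) ** alpha_only          # age ratio across 60-300 km/s
nelan_slope = predicted_slope(0.67, 0.47, RESP_AGE, RESP_FEH)

print(f"age only:  alpha = {alpha_only:.2f}")
print(f"Fe/H only: beta  = {beta_only:.1f}")
print(f"age factor over 60-300 km/s = {age_factor:.1f}")
print(f"slope predicted from the Nelan et al. exponents = {nelan_slope:.2f} A per dex")
```

Any pair (α, β) for which `predicted_slope` returns the observed slope lies on the degenerate track, so a single measured gradient cannot separate the two exponents without further information.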

An independent study of 124 galaxies by Thomas et al.
(2004b), also based on traditional Lick indices, and analysed
using the same models as Nelan et al., yields $\mathrm{Age} \propto \sigma^{0.25}$.
The Thomas et al. scalings predict an HαA slope of −0.27 Å
(−0.18 Å from age and −0.09 Å from metallicity), apparently
quite incompatible with our observed relation. Compared
to the sample used here, the Thomas et al. compilation
is weighted towards more massive objects (σ > 100 km s⁻¹,
with median of ~200 km s⁻¹), and also to fairly clean E or
S0 morphologies. Both the mass range and the morphological
mix could influence the results obtained, as could the
explicit colour-magnitude selection. Nelan et al. commented
on an apparent steepening of the relation for low-mass systems,
which could account for some of the disagreement with
Thomas et al. and comparable studies. Moreover, there are
indications that NFPS galaxies with stronger disks show
a steeper age-mass relation than pure ellipticals or
bulge-dominated S0s. Both effects act in the sense which could
explain the discrepancy between NFPS and Thomas et al.
Finally, the NFPS colour-magnitude selection excludes any
massive 'blue ellipticals'. If included, and if present in sufficient
numbers, such objects would tend to flatten the age-mass
relation.

Caldwell et al. (2003) studied a sample of 175 early-type
galaxies with σ = 50-300 km s⁻¹, similar to the
range probed in this paper. Using their favoured index combination,
Caldwell et al. derive a steep age-mass slope of
$\mathrm{Age} \propto \sigma^{\sim 1.0}$ (see their Figure 21 and Table 9), in agreement
with Nelan et al. and with our Hα results. Again, this seems
to argue for a dependence on the mass range, with giant ellipticals
favouring a flatter age-mass relation, and low-σ objects
yielding a steeper slope.

Figure 7. Observed Hα-log σ relations from NFPS data of
Nelan et al. (2005). Galaxies from the low-emission subsample
(see text), with HαA errors less than 0.3 Å, are shown as solid
points with error bars. Open squares mark galaxies rejected from
the final fit (solid line) by an iterative 3σ clipping process. Small
crosses indicate galaxies failing either the low-emission selection
or the low-error selection (or both). In each panel, the dashed
line shows the slope predicted for Hα-log σ using the age and
metallicity scaling relations derived by Nelan et al. from traditional
Lick indices. The scaling relations of Thomas et al. (2004b)
result in predicted slopes shown by the dotted line.



3.3 Effect of synthesis model errors 

As already hinted at, the above discussion neglects an important
source of error, arising from uncertainties in the population
synthesis model itself. These errors can be included in
the constraints by propagating the bootstrap-response distribution
of Figure 6. Recall that each bootstrap realisation
of the input stellar library was used to determine a new estimate
of the parameter responses. Most of the bootstrap realisations
yield responses close to the "best" values, and hence
yield (α, β) constraints similar to the picture described in
the previous section. On the other hand, some bootstrap
points show strong [Fe/H] responses (e.g. around −0.5 Å in
HαA per dex in Fe/H). These clearly improve the chances of
observing a strong HαA-σ trend with neither a strong age
trend nor an excessive [Fe/H]-σ scaling. Finally, the spray
of "anomalous" bootstrap results, due to realisations which
lack cool dwarfs, assigns some (low) probability to quite unexpected
(α, β) pairs.

Combining these simulations, Figure 8 shows the likelihood
function $\mathcal{L}(\alpha, \beta)$, allowing for the bootstrap errors and
also for measurement error in the observed Hα-σ slopes. As
expected, the range of viable scaling relations is considerably
expanded when the model errors are included. Formally, for
a constant-metallicity sequence, we obtain $\mathrm{Age} \propto \sigma^{0.95 \pm 0.12}$
from HαA. In the constant-age case, the Fe/H-σ relation
has a slope of β > 1.5 at 1σ (β > 1.2 at 2σ), with no meaningful
upper limits. Thus if a constant age is imposed, the
implied metallicity trend remains much steeper than values
favoured by Lick-index studies. Allowing both parameters to
vary, the Nelan et al. (2005) scalings lie at the 1σ contour.
The scalings of Thomas et al. (2004b) remain disfavoured at
the ~3σ level.
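
The construction of the likelihood is not spelled out in the text, so the following is only one plausible sketch: for each trial pair (α, β), the predicted slope is computed for every bootstrap pair of responses, compared with the observed slope under a Gaussian measurement error, and the resulting likelihoods are averaged over the realisations. The bootstrap responses used here are hypothetical Gaussian draws around the HαA best-fit values, standing in for the actual distribution of Figure 6.

```python
import numpy as np

def likelihood_grid(alphas, betas, resp_age, resp_feh, slope_obs, slope_err):
    """L(alpha, beta): Gaussian likelihood of the observed slope, averaged over
    bootstrap realisations of the (age, [Fe/H]) responses.
    Returns an array of shape (len(alphas), len(betas))."""
    a = resp_age[:, None, None]
    b = resp_feh[:, None, None]
    model = a * alphas[None, :, None] + b * betas[None, None, :]
    chi2 = ((slope_obs - model) / slope_err) ** 2
    like = np.exp(-0.5 * chi2) / (np.sqrt(2.0 * np.pi) * slope_err)
    return like.mean(axis=0)

# Hypothetical bootstrap responses (illustrative spreads, not the real distribution).
rng = np.random.default_rng(1)
resp_age = rng.normal(-0.783, 0.10, 1000)
resp_feh = rng.normal(-0.161, 0.15, 1000)

alphas = np.linspace(0.0, 2.0, 81)
betas = np.linspace(0.0, 6.0, 61)
like = likelihood_grid(alphas, betas, resp_age, resp_feh, slope_obs=-0.75, slope_err=0.07)

# Constant-metallicity case: slice at beta = 0.
alpha_best = alphas[np.argmax(like[:, 0])]
print(f"most likely alpha at beta = 0: {alpha_best:.2f}")
```

Averaging over realisations, rather than adding the model and measurement errors in quadrature, preserves the non-Gaussian shape of the bootstrap distribution, including any rare anomalous realisations.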

Two curious aspects of Figure 8 are worthy of brief
comment. First, the change in shape of the likelihood contours,
which now form a closed curve at 1σ, suggests that allowing
for the bootstrap errors has actually tightened the constraint
on the metallicity gradient taken separately. However, if we
integrate over the age gradient as a nuisance parameter, to
obtain the one-parameter constraint, the vertical "flaring" of
the likelihood function compensates for this effect. Second,
because the likelihoods were derived from only 1000 bootstrap
realisations, the 3σ contour is not well determined.
The noisy features at the lower left of the HαA panel
result from the anomalous bootstrap solutions, i.e. from the
upper right of Figure HJ. The 3σ contour is retained in the
plot mainly to show this effect, and to illustrate the discrete
tracks which contribute to ℒ(α, β).
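
The one-parameter constraint mentioned in this paragraph can be written explicitly. Assuming a flat prior on the age exponent α over the plotted range (an assumption; the prior is not stated in the text), marginalising over α gives:

```latex
\mathcal{L}(\beta) \;\propto\; \int \mathcal{L}(\alpha, \beta)\,\mathrm{d}\alpha
```

The width of the marginal likelihood therefore depends on the whole surface, not only on its cross-section at a single value of α.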

3.4 Nebular emission contamination 

The greatest obstacle to the use of Hα absorption as a stellar
age indicator is its potential for contamination by nebular
emission, both at Hα itself, and in the neighbouring [N II]
lines. In Section 3.1 we used the emission measurements at
Hβ and [O III] to reject the objects most strongly affected.
This approach lacks precision for very low-level contamination,
since the Hβ emission is weaker than Hα. Moreover,
the Hβ measurements are themselves made in the presence
of imperfectly known stellar absorption, potentially leading
to underestimated equivalent widths.

As a result, our "low-emission" subsample will include
some galaxies with low-level contamination. In general, weak
emission will yield smaller values for HαF, causing galaxies
to appear older. In the lower panel of Figure 7, the scatter
of outliers below the mean HαF-σ relation suggests such
an effect, with a few galaxies showing negative HαF values,
i.e. net emission. For HαA the situation is more complicated,
since [N II] in the red continuum region leads to increased
index values, which may explain the absence of such dramatic
outliers in the HαA-σ relation. This compensating influence
depends on the ratio of [N II] to Hα emission, which is
strongly correlated with absolute magnitude (Phillips et al.
1986). Most of the outliers in HαF are fairly high-mass
objects, where the [N II] compensation will be most efficient. In
this case, however, it is surprising that they are not detected
at [O III], which should also be strong in such objects.

Figure 8. The upper panel shows ℒ(α, β), the likelihood of the
observed HαA gradient, given scaling relations
$\mathrm{Age} \propto \sigma^{\alpha}$ and $\mathrm{Fe/H} \propto \sigma^{\beta}$.
In addition to the observational errors on the HαA-σ
slope, the likelihood explicitly accounts for synthesis model errors,
using the bootstrap simulations. Bold contours show the resulting
1σ, 2σ and 3σ confidence regions for two parameters. For
comparison, dashed contours show the equivalent limits if model errors
are neglected. The labels 'N' and 'T' refer to the scaling relations
from Nelan et al. (2005) and Thomas et al. (2004b) respectively.
The lower panel shows the equivalent results for HαF.

Although clipping of deviant points can remove a small
number of outliers, a more serious concern is whether the
mean relation itself might be driven by variations in emission
characteristics, rather than age or metallicity. One test
for this is to repeat the fits, allowing for terms in Hβ and
[O III] emission, as well as velocity dispersion. In these fits
the HαA-σ and HαF-σ slopes are very little changed, and
the emission-line coefficients are not significant. To generate
the observed Hα-σ trends with emission variations would
require increasing Hα emission with increasing mass (opposite
to the behaviour at Hβ). The similarity of results from
HαA and HαF is also difficult to explain if the Hα-σ slopes
are mainly due to emission: because the two indices respond
differently to [N II], we would need either a very low [N II]/Hα
ratio throughout, or else very little variation in this ratio
as a function of mass. Either option is contrary to the
behaviour observed in galaxies with strong emission (Phillips
et al. 1986). Moreover, the amount of emission would have
to be very precisely determined at any given mass, in order
to retain a tight Hα-σ relation at all.
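
The emission test described above amounts to adding extra regressors to the index-σ fit and checking whether the σ coefficient moves. A minimal sketch follows, using ordinary least squares on synthetic data. The paper does not specify the fitting method, so the estimator, the variable names and all numerical values here are illustrative assumptions.

```python
import random


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_linear(design, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic sample: the index depends on log(sigma) only (hypothetical values).
rng = random.Random(0)
log_sigma = [rng.uniform(1.8, 2.5) for _ in range(300)]
ew_hbeta = [rng.uniform(0.0, 0.3) for _ in range(300)]  # emission EW, Angstrom
ew_oiii = [rng.uniform(0.0, 0.5) for _ in range(300)]
index = [3.0 - 0.75 * s + rng.gauss(0.0, 0.1) for s in log_sigma]

# Fit 1: index against log(sigma) alone.
simple = fit_linear([[1.0, s] for s in log_sigma], index)
# Fit 2: add Hbeta and [O III] emission terms.
full = fit_linear(
    [[1.0, s, hb, o3] for s, hb, o3 in zip(log_sigma, ew_hbeta, ew_oiii)],
    index,
)
print("slope without emission terms:", simple[1])
print("slope with emission terms:   ", full[1])
print("emission coefficients:       ", full[2], full[3])
```

If emission were driving the trend, the slope would shift noticeably between the two fits and the emission coefficients would differ from zero by more than their uncertainties.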

A related concern is whether the Hβ-based selection itself
biases the sample against low-mass weak-absorption objects.
This is conceivable, since separating emission and absorption
components of Hβ is more difficult at low σ (Nelan
et al. 2005). The signature of such a bias would be an
increased proportion of emission-rejected galaxies below the
fit, at small σ. The effect is not readily apparent in
Figure Q. Running the fits with a variety of alternative emission
rejection schemes, including that of Nelan et al., the
HαA gradients are stable at the 15% level. For HαF, the
results are less stable, yielding steeper slopes when the looser
selection criteria are used.

Finally, emission does not easily account for the absolute
age discrepancy between HαA and HαF. Recall that
HαA yields older ages at given σ than HαF. The emission
effect should cause both indices to yield older ages, but HαA
should be less strongly affected, as a result of the [N II]
compensation.

In conclusion, emission contamination remains a very
serious concern for Hα absorption studies. Emission
undoubtedly generates outliers from the observed relations,
some of which cannot easily be excluded using other observations.
On balance, however, it appears unlikely that emission
variations are responsible for the strong, decreasing trends
of HαA and HαF with velocity dispersion.

3.5 Alternative explanations 

As well as sensitivity to nebular emission, our result is sub- 
ject to a number of other caveats, relating to assumptions 
and limitations of the synthesis model. The following para- 
graphs consider three such concerns: Blue Horizontal Branch 
(BHB) stars, non-solar abundance ratios and IMF slope vari- 
ation. 

The Balmer lines measure the luminosity-weighted
temperature of a stellar population, so the HαA trend could
be generated by any hot stellar component which becomes
stronger at low σ. Maraston & Thomas (2000) and Thomas
et al. (2004b) have explored models with BHB populations,
and shown that the strong Hβ absorption of some ellipticals
might indeed arise from BHB stars rather than from
younger ages. Because these stars have blue continua, they
contribute more strongly to the higher-order lines than to
Hβ (Schiavon et al. 2004), and their impact will be weakened
further at Hα. Quantitative modelling of the relative
responses should offer one approach to constraining the
importance of BHB stars in old populations.

We have argued that non-solar abundance ratios are
unlikely to have a strong effect on Hα, since there are no strong
metallic features within the band-passes used by our indices.
The only feature of concern is the Ca I 6573 Å line, which
affects the HαA and HαF indices in opposite senses. The
similarity of the scaling relation slopes obtained from the
two indices suggests that variations of Ca abundance with
σ do not strongly influence the results. On the other hand,
the difference in absolute ages implied by the two indices
could be related to this line. In particular, if Ca is
under-abundant for the whole sample (relative to STELIB), then
HαA will be anomalously low (leading to older ages), while
HαF will be larger (yielding younger ages). Thus the sense
of the effect is as observed. However, to force HαA and HαF
to give consistent ages at given σ, we need a large
metallicity difference, of > 0.3 dex. On balance, simple differences
in flux-calibration or resolution-matching seem more likely
explanations for the 0.1 Å offset required to resolve the
absolute age inconsistency.

Finally, we have explored the effects of changing the
assumed IMF slope in the synthesis. For power-law exponents
in the range 1.0-3.5 (cf. default 2.35), the predictions
for HαA and HαF vary by < 0.3 Å. Thus the full range of
~0.75 Å cannot be generated by plausible levels of variation
in the IMF slope.



4 DISCUSSION 

Having reviewed some of the potential systematic effects, in 
this section we consider whether a steep age-mass relation 
is compatible with other constraints, focusing on observa- 
tions of cluster galaxies at lookback times of a few Gyr. 
In what follows, we will adopt the HαA-derived relation of
$\mathrm{Age} \propto \sigma^{0.95 \pm 0.12}$, which assumes no metallicity variations.
Considering that Lick index studies suggest a mix of age 
and metallicity effects, this slope should be regarded as an 
upper limit. 

Since the absolute ages implied by the Hα synthesis are
unacceptably old, we fix the age zero-point by forcing the
most massive galaxies to the age of the universe. (Note that
if HαF were used instead, this rescaling would not be
necessary, and all other results would be essentially unchanged.)
In this section we adopt the WMAP cosmological parameters,
(h, Ω_m, Ω_Λ) = (0.71, 0.27, 0.73) (Bennett et al. 2003),
yielding a maximum age of 13.7 Gyr. In this case, the age-mass
relation implies a mean age of only 3 Gyr for the smallest
objects, i.e. with σ ~ 60 km s⁻¹, while the observed scatter
around the HαA-σ relation allows for an intrinsic age
range of 2-5 Gyr for these galaxies. The age-mass relation
is shown in Figure 9, together with some of the other results
discussed below. In all cases, the "age" or "formation
redshift" is to be understood as a luminosity-weighted mean age
for the stars and does not imply the galaxy assembly time.
At times earlier than this "age", the galaxy may have been
present in the cluster but with the bluer colours indicative
of a star-forming system.
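
As a rough consistency check on these numbers, the adopted slope can be converted into an age ratio. The sketch below assumes that the sample spans 0.7 dex in σ (the range quoted in the conclusions) and that the relation is a pure power law anchored at 13.7 Gyr at the high-mass end:

```latex
\frac{\mathrm{Age}(\sigma_{\max})}{\mathrm{Age}(\sigma_{\min})}
  = 10^{\,0.95 \times 0.7} \approx 4.6,
\qquad
\mathrm{Age}(\sigma_{\min}) \approx \frac{13.7\ \mathrm{Gyr}}{4.6} \approx 3.0\ \mathrm{Gyr}.
```

Propagating the ±0.12 uncertainty on the exponent gives ratios between about 3.8 and 5.6, i.e. minimum ages of roughly 2.4 to 3.6 Gyr under the same assumptions.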

The steep HαA-σ trend then appears to favour a picture
in which galaxies now at the faint end of the red
sequence ceased forming stars only at modest redshifts,
z < 0.3. A testable implication is that the red sequence in
high redshift clusters should be deficient in faint galaxies,
with the depletion mass increasing with increasing redshift.
Recent observations of z ~ 0.8 clusters have indeed suggested
that the red sequences are depleted at the faint end (de Lucia
et al. 2004; Goto et al. 2005). Converting their observed
magnitudes to a velocity dispersion (after accounting for
k-correction and passive evolution), the faintest
non-star-forming objects in these clusters have σ ~ 150 km s⁻¹, at
a look-back time of ~7 Gyr. For comparison, the age-mass
relation derived in this paper also places the current age
of such galaxies at ~7 Gyr. Intriguingly, there is evidence
for red-sequence truncation at much smaller lookback times.
Smail et al. (1998) observed clusters at z ~ 0.24 (lookback
time of only ~3 Gyr); their Figure 5 shows a marked cutoff
in the red sequence, at a magnitude corresponding to
σ ~ 110 km s⁻¹. For comparison, our age-mass relation
suggests ages of ~5 Gyr for such objects. A rigorous comparison
of these works is not trivial, since they do not use consistent
definitions for the "truncation magnitude". However, there
is very suggestive agreement with our HαA results, both in
terms of the absolute age of the faint red sequence galaxies,
and in the trend for fainter objects joining the red sequence
at lower redshifts.
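
The ages quoted in this paragraph can be reproduced approximately from the adopted power law. The sketch below pins the relation to the mean age of 3 Gyr quoted at σ ~ 60 km s⁻¹; that choice of anchor is an assumption made here for convenience, since the velocity dispersion of the most massive galaxies is not given in this section.

```python
ALPHA = 0.95      # adopted slope, Age ~ sigma^ALPHA (from the text)
SIGMA_REF = 60.0  # km/s, smallest objects in the sample (from the text)
AGE_REF = 3.0     # Gyr, mean age quoted at SIGMA_REF (from the text)


def mean_age(sigma_kms, alpha=ALPHA):
    """Mean stellar age in Gyr at velocity dispersion sigma_kms."""
    return AGE_REF * (sigma_kms / SIGMA_REF) ** alpha


for sigma in (60.0, 110.0, 150.0):
    print(f"sigma = {sigma:5.0f} km/s: mean age ~ {mean_age(sigma):.1f} Gyr")
# prints roughly 3.0, 5.3 and 7.2 Gyr
```

These values agree with the ~5 Gyr and ~7 Gyr quoted for the 110 and 150 km s⁻¹ objects.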

Additional support comes from the rapidly evolving
characteristic mass scale of E+A (or post-starburst) galaxies,
as determined by Tran et al. (2003). Such galaxies can
be interpreted as objects which have just ceased forming
stars, and are in the process of fading onto the red
sequence. The results of Tran et al. suggest that E+A galaxies
observed at z ~ 0.8 (lookback time 7 Gyr) will fade to
σ ~ 170 km s⁻¹ red-sequence objects today, while those in
the E+A phase at z ~ 0.3 (3.5 Gyr) will fall onto the red
sequence at σ ~ 100 km s⁻¹. Again, there is a suggestive level
of agreement with our HαA results: the slope of the E+A
mass evolution itself suggests $\mathrm{Age} \propto \sigma^{1.3}$,
slightly steeper than our result. The E+A "ages" fall ~1 Gyr
younger than our relation would predict. Qualitatively this is
expected, since the E+A phase signals the last star-formation
episode, while our relation describes the mean stellar age,
which is necessarily older.
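
The lookback times quoted in this section follow from the adopted cosmology. A minimal sketch is given below, using the standard closed-form age of a flat universe containing only matter and a cosmological constant (radiation is neglected, which is an approximation). The function names are illustrative.

```python
import math

H0 = 71.0        # km/s/Mpc, i.e. h = 0.71 (from the text)
OMEGA_M = 0.27   # matter density parameter (from the text)
OMEGA_L = 0.73   # cosmological constant density parameter (from the text)
HUBBLE_TIME_GYR = 977.8 / H0  # 1/H0 expressed in Gyr


def age_at_redshift(z):
    """Age in Gyr of a flat matter + Lambda universe at redshift z."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return HUBBLE_TIME_GYR * 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)


def lookback_time(z):
    """Lookback time in Gyr to redshift z."""
    return age_at_redshift(0.0) - age_at_redshift(z)


print(f"age today: {age_at_redshift(0.0):.1f} Gyr")  # about 13.7
for z in (0.24, 0.3, 0.8):
    print(f"z = {z}: lookback {lookback_time(z):.1f} Gyr")
# about 2.8, 3.4 and 6.8 Gyr respectively
```

These values are consistent with the rounded figures of ~3, 3.5 and 7 Gyr used in the comparison with the red-sequence truncation and E+A results.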

In conclusion, local galaxy studies which suggest rapid 
recent evolution in the red sequence population (Caldwell 
et al. 2003; Nelan et al. 2005; this work) are compatible 
with direct observations of galaxy populations at intermedi- 
ate redshifts. It is not, of course, implied that the faint red 
galaxies were themselves assembled at such recent epochs, 
but rather that star formation ceased at that time, perhaps 
following expulsion of remaining gas, or its rapid consump- 
tion in a starburst. On the other hand, it may be a challenge 
to reconcile such recent star formation with the tightness of 
the colour-magnitude relation itself (e.g. Bower, Lucey & 
Ellis 1992), without invoking age-metallicity "conspiracy"
models.

Figure 9. Formation history of red-sequence cluster galaxies.
The shaded region shows the mean age-mass relation
$\mathrm{Age} \propto \sigma^{0.95 \pm 0.12}$, as suggested by the
HαA-σ relation of this paper.
Plausible metallicity variations could flatten this relation by
10-40% (see Figure 8). The ages are calibrated by fixing the
high-mass galaxies to the age of the universe, 13.7 Gyr. The results of
Nelan et al. (2005), using multiple Lick indices for ~3000 NFPS
galaxies, are shown as the filled circles. Open circles show the
characteristic velocity dispersion of E+A galaxies in clusters from
z=0.3 to z=0.8 (Tran et al. 2003). These results may be compared
to the low-redshift age measurements if the E+A objects are
interpreted as newly quiescent objects fading onto the red sequence.
Finally, crosses indicate the mass scale corresponding to faint-end
truncation of the red sequence in clusters at z=0.24 (Smail et al.
1998), z=0.75 (de Lucia et al. 2004) and z=0.83 (Goto et al. 2005).



5 CONCLUSIONS

We have investigated the behaviour of Hα absorption indices
in single-age, single-metallicity stellar populations, and
compared to observed trends in red-sequence cluster galaxies.
The principal conclusions are:

(i) The Hα absorption strength, measured using HαA or
HαF, appears promising as a probe of stellar ages in
integrated spectra, with only weak response to metallicity
variations. The indices are expected to be more robust against
non-solar [α/Fe] ratios than the high-order Balmer lines.

(ii) The principal drawback to the Hα method is its
sensitivity to nebular emission contamination, which can only
be minimized through careful sample selection.

(iii) At face value, the slope of the observed HαA-σ
relation, from NFPS, requires a factor of ~5 in age over the
observed 0.7 dex range in velocity dispersion, with younger
ages for lower-mass objects. A pure metallicity interpretation
requires Δ[Fe/H] > 1.1 dex over the same range,
inconsistent with other constraints.

(iv) The combined, more modest, age and metallicity
trends reported by Nelan et al. (2005) for NFPS reproduce
the observed HαA-σ slope within the errors. By contrast,
the scaling relations of Thomas et al. (2004b) significantly
under-predict the slope.

(v) If the Hα-σ slope is due purely to age, the derived
age-mass relation implies ages of only a few Gyr for
low-luminosity red-sequence galaxies in clusters. Interpreted as
the time at which such galaxies ceased forming stars, the
ages broadly agree with an observed deficit of faint
red-sequence members in clusters at z ~ 0.7.

In summary, Hα absorption is a potentially powerful,
but neglected, probe of early-type galaxy formation epochs.
Future work with improved stellar libraries should aim to
constrain further the metallicity dependence of Hα indices,
and determine optimal passbands for index definitions. The
analysis in this paper supports a steep age-mass relation
along the red sequence of cluster galaxies. Although the most
massive cluster ellipticals are indeed very old, our work
suggests that the faint red-sequence galaxies became quiescent
as recently as z < 0.5. Qualitatively, this "downsizing"
scenario (cf. Cowie et al. 1996) agrees with recent detections of
a faint-end depletion of the red sequence at higher redshifts,
and with the evolution of E+A galaxy masses over the past
~7 Gyr.